Co-learning of Word Representations and Morpheme Representations
Authors
Abstract
Techniques that use neural networks to learn distributed word representations (i.e., word embeddings) have been applied to a variety of natural language processing tasks. Recently proposed methods such as CBOW and Skip-gram learn word embeddings from context information, and the resulting embeddings capture both semantic and syntactic relationships between words. However, it remains challenging to produce high-quality representations for rare or unknown words, which have insufficient context information. In this paper, we propose to leverage morphological knowledge to address this problem. In particular, we introduce morphological knowledge both as an additional input representation and as auxiliary supervision for the neural network. As a result, beyond word representations, the proposed model produces morpheme representations, which can be further employed to infer the representations of rare or unknown words from their morphological structure. Experiments on an analogical reasoning task and several word similarity tasks demonstrate that our method produces higher-quality word embeddings than state-of-the-art methods.
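A minimal sketch of the inference step the abstract describes: once morpheme representations are available, a rare or unknown word's vector can be composed from the vectors of its morphemes. The segmentation table, vector dimensionality, and simple averaging used here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

EMB_DIM = 100
rng = np.random.default_rng(0)

# Hypothetical lookup tables; in the paper these would come from co-training
# word and morpheme representations in the neural network.
word_emb = {"read": rng.normal(size=EMB_DIM)}
morpheme_emb = {
    "read": rng.normal(size=EMB_DIM),
    "un": rng.normal(size=EMB_DIM),
    "able": rng.normal(size=EMB_DIM),
}

def segment(word):
    """Toy morphological segmentation; a real system would use a proper
    morphological analyzer (e.g., Morfessor)."""
    table = {"unreadable": ["un", "read", "able"]}
    return table.get(word, [word])

def embed(word):
    """Return the trained word vector if available; otherwise compose a
    vector from the word's morpheme vectors (here: a simple average)."""
    if word in word_emb:
        return word_emb[word]
    parts = [morpheme_emb[m] for m in segment(word) if m in morpheme_emb]
    return np.mean(parts, axis=0) if parts else np.zeros(EMB_DIM)

vec = embed("unreadable")   # unknown word gets a morpheme-based vector
print(vec.shape)            # (100,)
```

The averaging here stands in for whatever composition function the trained model defines; the point is only that morpheme representations let the model assign a meaningful vector to a word it never saw with sufficient context.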
Similar references
Word Representation Models for Morphologically Rich Languages in Neural Machine Translation
Dealing with the complex word forms of morphologically rich languages is an open problem in language processing and is particularly important in translation. In contrast to most modern neural translation systems, which discard the identity of rare words, in this paper we propose several architectures for learning word representations from character- and morpheme-level word decompositions. ...
Word Type Effects on L2 Word Retrieval and Learning: Homonym versus Synonym Vocabulary Instruction
The purpose of this study was twofold: (a) to assess the retention of two word types (synonyms and homonyms) in short-term memory, and (b) to investigate the effect of these word types on word learning by asking learners to learn their Persian meanings. A total of 73 Iranian language learners studying English translation participated in the study. For the first purpose, 36 freshmen from an ...
Better Word Representations with Recursive Neural Networks for Morphology
Vector-space word representations have been very successful in recent years at improving performance across a variety of NLP tasks. However, in most existing work words are regarded as independent entities, without any explicit relationship among morphologically related words being modeled. As a result, rare and complex words are often poorly estimated, and all unknown words are represen...
Using functional magnetic resonance imaging (fMRI) to explore brain function: cortical representations of language critical areas
Pre-operative determination of the dominant hemisphere for speech, and of speech-associated sensory and motor regions, has been of great interest to neurological surgeons. This has been of utmost importance but difficult to achieve, requiring either invasive (Wada test) or non-invasive (brain mapping) methods. In the present study we have employed functional Magnetic Resonance Imaging...
Pragmatic Representations in Iranian High School English Textbooks
Owing to the growing interest in communicative, cultural, and pragmatic aspects of second language learning in recent years, the present study investigated representations of pragmatic aspects of English as a foreign language in Iranian high school textbooks. Using Halliday's (1978) and Searle's (1976) models, different language functions and speech acts were specifically determined and...